Your task is to edit this R Markdown notebook to include your solutions, ensuring that it is well-organized and professional in appearance. Answers must be clear and concise. Start by downloading the Rmd file.
Collaboration Policy. You are allowed to discuss this workbook with your classmates and work in groups. Despite group discussions, each student must write and submit their own solutions.
Assistance Policy. You may ask for clarifications from the instructor and teaching assistant in class. Do not seek help on these workbooks outside of class, as they are intended to be completed during class time.
Submission Requirements. Solutions must be submitted on Brightspace as a PDF writeup. Use the ‘Knit to PDF’ feature in RStudio to prepare your PDF document. Ensure that your PDF document looks like the provided PDF version of the workbook, including all code used to obtain the results in proper code blocks. Make sure your PDF writeup includes your name in the title (replace my name with yours).
Grading Criteria. This workbook is worth 10 points. To achieve a grade of 10, your writeup must correctly answer all questions, be easy to understand, and be formatted correctly.
Work Timeline. You are expected to work on this assignment in class, but you may complete it at home within 24 hours. It is due on August 28, 2024, at 9:30 pm. Late submissions will incur a 50% deduction for a delay of up to 24 hours, and an additional 10% deduction for each additional day.
Remember, you will need to use the fpp3 package to complete this assignment, as shown in the lecture.
suppressMessages(library(fpp3))
Explore the following four time series: Bricks from aus_production, Lynx from pelt, Close from gafa_stock, Demand from vic_elec.
Use autoplot() to produce a time plot of each series. Describe the patterns seen in each series. Hint: Use ? (or help()) to find out more information about the data in each series.
aus_production
The observations are quarterly.
aus_production |> autoplot(Bricks)
Warning: Removed 20 rows containing missing values or values outside the scale range
(`geom_line()`).
An upward trend is apparent until 1980, after which the number of clay bricks produced starts to decline. A seasonal pattern is evident in the data, and sharp drops are visible in some quarters.
pelt
Observations are made once per year.
pelt |> autoplot(Lynx)
Canadian lynx trappings are cyclic: the height of the peaks is unpredictable, and the spacing between peaks is irregular, though approximately 10 years.
gafa_stock
Interval is daily. Looking closer at the data, we can see that the index is a Date variable. It also appears that observations occur only on trading days, creating lots of implicit missing values.
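One way to check the trading-day observation is to count the observations by day of the week. This is only a sketch: it assumes the fpp3 package is attached (which also attaches dplyr and lubridate), and the name obs_by_weekday is illustrative rather than part of the workbook.

```r
library(fpp3)

# Count observations per stock and day of the week (1 = Monday, ..., 7 = Sunday).
# as_tibble() drops the tsibble structure so count() treats this as plain data.
obs_by_weekday <- gafa_stock |>
  as_tibble() |>
  mutate(weekday = wday(Date, week_start = 1)) |>
  count(Symbol, weekday)

obs_by_weekday
```

If weekend days never appear in the weekday column, the calendar-day gaps are implicit (the rows are absent) rather than recorded as NA.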
gafa_stock |>
  autoplot(Close)
Stock prices for these technology stocks have risen for most of the series, until mid-late 2018.
The four stocks are on different scales, so they are not directly comparable. A plot with faceting would be better.
gafa_stock |>
  ggplot(aes(x = Date, y = Close, group = Symbol)) +
  geom_line(aes(col = Symbol)) +
  facet_grid(Symbol ~ ., scales = 'free')
The downturn in the second half of 2018 is now very clear, with Facebook taking a big drop (about 20%) in the middle of the year.
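A drop like this can be quantified by computing day-to-day percentage changes in the closing price. The following is a minimal sketch, assuming fpp3 is attached; fb_changes and pct_change are illustrative names, and lag() here is dplyr::lag(), which shifts by one observed trading day rather than one calendar day.

```r
library(fpp3)

# Daily percentage change in Facebook's closing price during 2018.
fb_changes <- gafa_stock |>
  as_tibble() |>
  filter(Symbol == "FB", year(Date) == 2018) |>
  arrange(Date) |>
  mutate(pct_change = 100 * (Close / lag(Close) - 1))

# The largest one-day fall in the closing price
fb_changes |>
  slice_min(pct_change, n = 1) |>
  select(Symbol, Date, Close, pct_change)
```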
The stocks tend to move roughly together, as you would expect with companies in the same industry.
vic_elec
interval(vic_elec)
<interval[1]>
[1] 30m
Data is available at 30 minute intervals.
vic_elec |>
  autoplot(Demand)
The series appears to have an annual seasonal pattern, with higher demand during summer and winter. Little detail is visible at this scale, so let’s zoom in.
vic_elec |>
  filter(yearmonth(Time) == yearmonth("2012 June")) |>
  autoplot(Demand)
The series appears to have a daily pattern, with less electricity used overnight. There also appears to be a working-day effect (less demand on weekends and holidays).
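The working-day effect can also be checked numerically by comparing average demand on working days with weekends and public holidays. This is a rough sketch that assumes fpp3 is attached and relies on the Date and Holiday columns that ship with vic_elec; the names day_type and demand_by_day_type are illustrative.

```r
library(fpp3)

# Label each half-hour as a working day or a weekend/public holiday,
# then compare mean demand between the two groups.
demand_by_day_type <- vic_elec |>
  as_tibble() |>
  mutate(
    day_type = if_else(
      Holiday | wday(Date, week_start = 1) > 5,  # 6 = Saturday, 7 = Sunday
      "Weekend or holiday",
      "Working day"
    )
  ) |>
  group_by(day_type) |>
  summarise(mean_demand = mean(Demand))

demand_by_day_type
```

A simple mean ignores temperature and time of year, so it only gives a first indication of the size of the effect.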
vic_elec |>
  autoplot(Demand / 1e3) +
  labs(
    x = "Date",
    y = "Demand (GW)",
    title = "Half-hourly electricity demand",
    subtitle = "Victoria, Australia"
  )
Here the annual seasonality is clear, with high volatility in summer, and peaks in summer and winter. The weekly seasonality is also visible, but the daily seasonality is hidden due to the compression on the horizontal axis.
Use filter() to find what days corresponded to the peak closing price for each of the four stocks in gafa_stock.
gafa_stock |>
  group_by(Symbol) |>
  filter(Close == max(Close)) |>
  ungroup() |>
  select(Symbol, Date, Close)
The aus_arrivals data set comprises quarterly international arrivals (in thousands) to Australia from Japan, New Zealand, UK and the US. Use autoplot(), gg_season() and gg_subseries() to compare the differences between the arrivals from these four countries. Can you identify any unusual observations?
aus_arrivals |> autoplot(Arrivals)
Generally, the number of arrivals to Australia increases over the entire series, with the exception of arrivals from Japan, which begin to decline after 1995. The series appear to have a seasonal pattern that varies proportionately to the number of arrivals. Interestingly, the number of visitors from NZ peaks sharply in 1988. The seasonal pattern for Japan appears to change substantially.
aus_arrivals |> gg_season(Arrivals, labels = "both")
The seasonal pattern of arrivals appears to vary between countries. In particular, arrivals from the UK appear to be lowest in Q2 and Q3, and increase substantially in Q4 and Q1. For NZ visitors, by contrast, arrivals are lowest in Q1 and highest in Q3. Similar variations can be seen for Japan and the US.
aus_arrivals |> gg_subseries(Arrivals)
The subseries plot reveals more interesting features. It is evident that whilst UK arrivals are increasing, most of this increase is seasonal. More arrivals are coming during Q1 and Q4, whilst the increase in Q2 and Q3 is less extreme. The growth in arrivals from NZ and the US appears fairly similar across all quarters. There is an unusual spike in arrivals from the US in 1992 Q3.
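To inspect a suspected unusual observation more closely, the relevant quarter can be isolated and compared year on year. The sketch below assumes fpp3 is attached; us_q3 and change are illustrative names, and the 1990 to 1994 window is an arbitrary choice for viewing the years around the spike noted above.

```r
library(fpp3)

# Q3 arrivals from the US, with the change from the previous year's Q3.
us_q3 <- aus_arrivals |>
  as_tibble() |>
  filter(Origin == "US", quarter(as.Date(Quarter)) == 3) |>
  arrange(Quarter) |>
  mutate(change = Arrivals - lag(Arrivals))

# Look at the years surrounding the suspected spike
us_q3 |>
  filter(year(as.Date(Quarter)) %in% 1990:1994)
```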
Unusual observations: